Multi-Dimensional Regression Analysis of Time-Series Data Streams

Authors

  • Yixin Chen
  • Guozhu Dong
  • Jiawei Han
  • Benjamin W. Wah
  • Jianyong Wang
Abstract

Real-time production systems and other dynamic environments often generate tremendous (potentially infinite) amount of stream data; the volume of data is too huge to be stored on disks or scanned multiple times. Can we perform on-line, multi-dimensional analysis and data mining of such data to alert people about dramatic changes of situations and to initiate timely, high-quality responses? This is a challenging task. In this paper, we investigate methods for on-line, multi-dimensional regression analysis of time-series stream data, with the following contributions: (1) our analysis shows that only a small number of compressed regression measures instead of the complete stream of data need to be registered for multi-dimensional linear regression analysis, (2) to facilitate on-line stream data analysis, a partially materialized data cube model, with regression as measure, and a tilt time frame as its time dimension, is proposed to minimize the amount of data to be retained in memory or stored on disks, and (3) an exception-guided drilling approach is developed for on-line, multi-dimensional exception-based regression analysis. Based on this design, algorithms are proposed for efficient analysis of time-series data streams. Our performance study compares the proposed algorithms and identifies the most memory- and time-efficient one for multi-dimensional stream data analysis.

The work was supported in part by grants from U.S. National Science Foundation, the University of Illinois, and Microsoft Research.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.

Proceedings of the 28th VLDB Conference, Hong Kong, China, 2002
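
Contribution (1) of the abstract says that a few compressed measures can stand in for the full stream. The abstract does not define those measures, so the sketch below substitutes the textbook sufficient statistics for a least-squares line z = base + slope * t. It is an illustration of the general idea only, not the paper's own representation; the class name RegressionSummary and its methods are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class RegressionSummary:
    """Five running sums that suffice to fit z = base + slope * t.

    Assumption: this is the standard sufficient-statistics form,
    not the compressed measure defined in the paper.
    """
    n: int = 0
    sum_t: float = 0.0
    sum_z: float = 0.0
    sum_tt: float = 0.0
    sum_tz: float = 0.0

    def add(self, t: float, z: float) -> None:
        """Absorb one stream point; the point itself is not stored."""
        self.n += 1
        self.sum_t += t
        self.sum_z += z
        self.sum_tt += t * t
        self.sum_tz += t * z

    def merge(self, other: "RegressionSummary") -> "RegressionSummary":
        """Combine summaries built from two disjoint sets of points."""
        return RegressionSummary(
            self.n + other.n,
            self.sum_t + other.sum_t,
            self.sum_z + other.sum_z,
            self.sum_tt + other.sum_tt,
            self.sum_tz + other.sum_tz,
        )

    def fit(self) -> Tuple[float, float]:
        """Return (base, slope) of the least-squares line."""
        denom = self.n * self.sum_tt - self.sum_t ** 2
        if denom == 0:
            raise ValueError("need at least two distinct time points")
        slope = (self.n * self.sum_tz - self.sum_t * self.sum_z) / denom
        base = (self.sum_z - slope * self.sum_t) / self.n
        return base, slope


# Usage: two adjacent time windows are summarized separately, then merged.
early, late = RegressionSummary(), RegressionSummary()
for t in range(0, 5):
    early.add(t, 2 + 3 * t)
for t in range(5, 10):
    late.add(t, 2 + 3 * t)
base, slope = early.merge(late).fit()
print(base, slope)
```

Because merging is just addition of the five sums, a coarser summary can be computed from finer ones without revisiting the raw stream, which is the property that makes this kind of measure convenient for roll-up style aggregation.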

Similar resources

Analytical D’Alembert Series Solution for Multi-Layered One-Dimensional Elastic Wave Propagation with the Use of General Dirichlet Series

A general initial-boundary value problem of one-dimensional transient wave propagation in a multi-layered elastic medium due to arbitrary boundary or interface excitations (either prescribed tractions or displacements) is considered. Laplace transformation technique is utilised and the Laplace transform inversion is facilitated via an unconventional method, where the expansion of complex-valued...

StreamFitter: A Real Time Linear Regression Analysis System for Continuous Data Streams

In this demo, we present the StreamFitter system for real-time regression analysis on continuous data streams. In order to perform regression on data streams, it is necessary to continuously update the regression model parameters while receiving new data. In this demo, we will present two approaches for on-line, multi-dimensional linear regression analysis of stream data, namely Incremental Mat...

Application of Markov-Chain Analysis and Stirred Tanks in Series Model in Mathematical Modeling of Impinging Streams Dryers

In spite of the fact that the principles of impinging stream reactors have been developed for more than half a century, the performance analysis of such devices, from the viewpoint of the mathematical modeling, has not been investigated extensively. In this study two mathematical models were proposed to describe particulate matter drying in tangential impinging stream dryers. The models were de...

Cypress: Managing Massive Time Series Streams with Multi-Scale Compressed Trickles

We present Cypress, a novel framework to archive and query massive time series streams such as those generated by sensor networks, data centers, and scientific computing. Cypress applies multi-scale analysis to decompose time series and to obtain sparse representations in various domains (e.g. frequency domain and time domain). Relying on the sparsity, the time series streams can be archived wi...

3D-QSAR and docking analysis on a series of multi-cyclin-dependent kinase inhibitors using CoMFA, and CoMSIA

A series of 42 Pyrazolo[4,3-h]quinazoline-3-carboxamides as multi-cyclin-dependent kinase inhibitors regarded as promising antitumor agents to complement the existing therapies, was subjected to a three-dimensional quantitative activity relationship (3D QSAR). Different QSAR methods, comparative molecular field analysis (CoMFA), CoMFA region focusing, and comparative molecular similarity indices an...

Publication date: 2002